Results of the WMT16 Metrics Shared Task
نویسندگان
چکیده
This paper presents the results of the WMT16 Metrics Shared Task. We asked participants of this task to score the outputs of the MT systems involved in the WMT16 Shared Translation Task. We collected scores of 16 metrics from 9 research groups. In addition to that, we computed scores of 9 standard metrics (BLEU, SentBLEU, NIST, WER, PER, TER and CDER) as baselines. The collected scores were evaluated in terms of system-level correlation (how well each metric’s scores correlate with WMT16 official manual ranking of systems) and in terms of segment level correlation (how often a metric agrees with humans in comparing two translations of a particular sentence). This year there are several additions to the setup: large number of language pairs (18 in total), datasets from different domains (news, IT and medical), and different kinds of judgments: relative ranking (RR), direct assessment (DA) and HUME manual semantic judgments. Finally, generation of large number of hybrid systems was trialed for provision of more conclusive system-level metric rankings.
منابع مشابه
UHH Submission to the WMT17 Metrics Shared Task
In this paper the UHH submission to the WMT17 Metrics Shared Task is presented, which is based on sequence and tree kernel functions applied to the reference and candidate translations. In addition we also explore the effect of applying the kernel functions on the source sentence and a back-translation of the MT output, but also on the pair composed of the candidate translation and a pseudo-ref...
متن کاملBlend: a Novel Combined MT Metric Based on Direct Assessment - CASICT-DCU submission to WMT17 Metrics Task
Existing metrics to evaluate the quality of Machine Translation hypotheses take different perspectives into account. DPMFcomb, a metric combining the merits of a range of metrics, achieved the best performance for evaluation of to-English language pairs in the previous two years of WMT Metrics Shared Tasks. This year, we submit a novel combined metric, Blend, to WMT17 Metrics task. Compared to ...
متن کاملEdinburgh's Statistical Machine Translation Systems for WMT16
This paper describes the University of Edinburgh’s phrase-based and syntax-based submissions to the shared translation tasks of the ACL 2016 First Conference on Machine Translation (WMT16). We submitted five phrase-based and five syntaxbased systems for the news task, plus one phrase-based system for the biomedical task.
متن کاملResults of the WMT16 Tuning Shared Task
This paper presents the results of the WMT16 Tuning Shared Task. We provided the participants of this task with a complete machine translation system and asked them to tune its internal parameters (feature weights). The tuned systems were used to translate the test set and the outputs were manually ranked for translation quality. We received 4 submissions in the Czech-English and 8 in the Engli...
متن کاملFindings of the WMT 2016 Bilingual Document Alignment Shared Task
This paper presents the results of the WMT16 Bilingual Document Alignment Shared Task. Given crawls of web sites, we asked participants to align documents that are translations of each other. 11 research groups submitted 19 systems, with a top performance of 95.0%.
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2014